NSF PAR Search | NSF Public Access Repository

ZipLLM: Efficient LLM Storage via Model-Aware Synergistic Data Deduplication and Compression

Wang, Zirui; Lan, Tingfeng; Su, Zhaoyuan; Yang, Juncheng; Cheng, Yue (May 2026, 23rd USENIX Symposium on Networked Systems Design and Implementation (NSDI '26))

Modern model hubs, such as Hugging Face, store tens of petabytes of LLMs, with fine-tuned variants vastly outnumbering base models and dominating storage consumption. Existing storage reduction techniques, such as deduplication and compression, are either LLM-oblivious or not compatible with each other, limiting data reduction effectiveness. Our large-scale characterization study across all publicly available Hugging Face LLM repositories reveals several key insights: (1) fine-tuned models within the same family exhibit highly structured, sparse parameter differences suitable for delta compression; (2) bitwise similarity enables LLM family clustering; and (3) tensor-level deduplication is better aligned with model storage workloads, achieving high data reduction with low metadata overhead. Building on these insights, we design BitX, an effective, fast, lossless delta compression algorithm that compresses the XORed difference between fine-tuned and base LLMs. We build ZipLLM, a model storage reduction pipeline that unifies tensor-level deduplication and lossless BitX compression. By synergizing deduplication and compression around LLM family clustering, ZipLLM reduces model storage consumption by 54%, over 20% higher than state-of-the-art deduplication and compression approaches.
Free, publicly-accessible full text available May 4, 2027
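
The XOR-based delta idea in the ZipLLM abstract above can be illustrated with a minimal sketch. This is not the authors' BitX implementation: `zlib` stands in for whatever compressor BitX actually uses, and the function names (`xor_bytes`, `compress_delta`, `restore`, `tensor_key`) and the toy float32 data are hypothetical, chosen only for illustration. The sketch shows why XORing a fine-tuned tensor against its base yields a mostly-zero byte stream that compresses well, and how a content hash can identify byte-identical tensors for deduplication.

```python
import hashlib
import struct
import zlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("tensors must have the same byte length")
    return bytes(x ^ y for x, y in zip(a, b))


def compress_delta(base: bytes, finetuned: bytes) -> bytes:
    """XOR the fine-tuned tensor against its base, then compress the result.

    zlib is an assumption here, used as a generic lossless compressor.
    """
    return zlib.compress(xor_bytes(base, finetuned), 9)


def restore(base: bytes, blob: bytes) -> bytes:
    """Invert compress_delta: decompress, then XOR with the base again."""
    return xor_bytes(base, zlib.decompress(blob))


def tensor_key(raw: bytes) -> str:
    """Content hash for spotting byte-identical tensors (deduplication)."""
    return hashlib.sha256(raw).hexdigest()


# Toy data: a "base" tensor and a "fine-tuned" copy with a few changed values.
base_vals = [0.001 * i for i in range(4096)]
tuned_vals = list(base_vals)
for i in range(0, 4096, 512):
    tuned_vals[i] += 1e-4

base = struct.pack(f"<{len(base_vals)}f", *base_vals)
tuned = struct.pack(f"<{len(tuned_vals)}f", *tuned_vals)

blob = compress_delta(base, tuned)
assert restore(base, blob) == tuned  # the round trip is lossless

# Raw size, XOR-delta compressed size, and directly compressed size in bytes.
print(len(tuned), len(blob), len(zlib.compress(tuned, 9)))
```

Because XOR is its own inverse, restoring the fine-tuned tensor needs only the base tensor and the compressed delta, which is what makes the scheme lossless.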
Adversarial Evasion through Semantics-Preserving Reinforcement Learning in Graph-Based Malware Detection

Zhang, Lan; Liu, Peng (December 2025, IntechOpen)

The deployment of deep learning-based malware detection systems has transformed cybersecurity, offering sophisticated pattern recognition capabilities that surpass traditional signature-based approaches. However, these systems introduce new vulnerabilities requiring systematic investigation. This chapter examines adversarial attacks against graph neural network-based malware detection systems, focusing on semantics-preserving methodologies that evade detection while maintaining program functionality. We introduce a reinforcement learning (RL) framework that formulates the attack as a sequential decision-making problem, optimizing the insertion of no-operation (NOP) instructions to manipulate graph structure without altering program behavior. Comparative analysis includes three baseline methods: random insertion, hill-climbing, and gradient-approximation attacks. Our experimental evaluation on real-world malware datasets reveals significant differences in effectiveness, with the reinforcement learning approach achieving perfect evasion rates against both Graph Convolutional Network and Deep Graph Convolutional Neural Network architectures while requiring minimal program modifications. Our findings reveal three critical research gaps: transitioning from abstract Control Flow Graph representations to executable binary manipulation, developing universal vulnerability discovery across different architectures, and systematically translating adversarial insights into defensive enhancements. This work contributes to understanding adversarial vulnerabilities in graph-based security systems while establishing frameworks for evaluating machine learning-based malware detection robustness.
Free, publicly-accessible full text available December 1, 2026
Influence of additive–polymer interactions on the mechanical behaviors of cross-linked polymers

https://doi.org/10.1016/j.eml.2025.102403

Nie, Wenjian; Xu, Lan; Xia, Wenjie (October 2025, Extreme Mechanics Letters)

Free, publicly-accessible full text available October 14, 2026
Automated Statistical Testing and Certification of a Reliable Model-Coupling Server for Scientific Computing

https://doi.org/10.18293/SEKE2025-050

Wolfgang, Seth; Lin, Lan; Song, Fengguang (September 2025, KSI Research)

Free, publicly-accessible full text available September 29, 2026
Hydration-Driven Interfacial Behaviors of Nanoconfined Sodium Montmorillonite

https://doi.org/10.1021/acs.langmuir.5c04534

Chen, Long; Wu, Zhuang; Xu, Lan; Li, Zhaofan; Zhang, Yida; Xia, Wenjie (November 2025, Langmuir)

Free, publicly-accessible full text available November 13, 2026
Coming of age of plant amphisomes

https://doi.org/10.1080/15548627.2025.2589272

Lan, Hu-Jiao; Huang, Min-Jun; Bednarek, Sebastian Y; Liu, Jian-Zhong (November 2025, Autophagy)

In metazoans, autophagosomes fuse with late endosomes (LEs)/multivesicular bodies (MVBs) to form a hybrid organelle known as an amphisome. Subsequently, upon fusion with lysosomes, the contents of amphisomes are degraded. While the formation of metazoan amphisomes has been well established, it has remained an open question whether amphisomes form and deliver their cargo to the central vacuole for degradation in plant cells. In this mini review, we provide an update on recent discoveries in the field of plant autophagy that demonstrate the formation of amphisome-like organelles that are generated through several distinct autophagosome/MVB fusion pathways.
Free, publicly-accessible full text available November 23, 2026
Advancing nonadiabatic molecular dynamics simulations in solids with E(3) equivariant deep neural hamiltonians

https://doi.org/10.1038/s41467-025-57328-1

Zhang, Changwei; Zhong, Yang; Tao, Zhi-Guo; Qin, Xinming; Shang, Honghui; Lan, Zhenggang; Prezhdo, Oleg V; Gong, Xin-Gao; Chu, Weibin; Xiang, Hongjun (December 2025, Nature Communications)

Non-adiabatic molecular dynamics (NAMD) simulations have become an indispensable tool for investigating excited-state dynamics in solids. In this work, we propose a general framework, N²AMD (Neural-Network Non-Adiabatic Molecular Dynamics), which employs an E(3)-equivariant deep neural Hamiltonian to boost the accuracy and efficiency of NAMD simulations. Distinct from conventional machine learning methods that predict key quantities in NAMD, N²AMD computes these quantities directly with a deep neural Hamiltonian, ensuring excellent accuracy, efficiency, and consistency. N²AMD not only achieves impressive efficiency in performing NAMD simulations at the hybrid functional level within the framework of the classical path approximation (CPA), but also demonstrates great potential in predicting non-adiabatic coupling vectors and suggests a method to go beyond CPA. Furthermore, N²AMD demonstrates excellent generalizability and enables seamless integration with advanced NAMD techniques and infrastructures. Taking several extensively investigated semiconductors as prototypical systems, we successfully simulate carrier recombination in both pristine and defective systems at large scales where conventional NAMD often significantly underestimates lifetimes or even predicts them qualitatively incorrectly. This framework offers a reliable and efficient approach for conducting accurate NAMD simulations across various condensed materials.
Free, publicly-accessible full text available December 1, 2026
Learning Code-Edit Embeddings to Model Student Debugging Behavior

https://doi.org/10.1007/978-3-031-98465-5_12

Heickal, Hasnain; Lan, Andrew (July 2025, Springer Nature Switzerland)

Free, publicly-accessible full text available July 20, 2026
MFNetSim: A Multi-Fidelity Network Simulation Framework for Multi-Traffic Modeling of Dragonfly Systems

https://doi.org/10.1145/3729424

Wang, Xin; Brown, Kevin A; Ross, Robert B; Carothers, Christopher D; Lan, Zhiling (September 2025, ACM Transactions on Modeling and Computer Simulation)

In high-performance computing (HPC), modern supercomputers typically provide exclusive computing resources to user applications. Nevertheless, the interconnect network is a shared resource for both inter-node communication and across-node I/O access, among co-running workloads, leading to inevitable network interference. In this study, we develop MFNetSim, a multi-fidelity modeling framework that enables simulation of multi-traffic simultaneously over the interconnect network, including inter-process communication and I/O traffic. By combining different levels of abstraction, MFNetSim can efficiently co-model the communication and I/O traffic occurring on HPC systems equipped with flash-based storage. We conduct simulation studies of hybrid workloads composed of traditional HPC applications and emerging ML applications on a 1,056-node Dragonfly system with various configurations. Our analysis provides various observations regarding how network interference affects communication and I/O traffic.
Free, publicly-accessible full text available September 12, 2026
Genomic divergence, demographic histories, and male territorial response reveal asymmetric reproductive barriers in allopatric eastern versus western Nashville warbler subspecies (Leiothlypis ruficapilla)

https://doi.org/10.1093/evolut/qpaf221

Phung, Lan-Nhi; Baiz, Marcella D; Wood, Andrew W; Moore, Madison; Toews, David P L; Wagner, Catherine (ed.); Connallon, Tim (ed.) (October 2025, Evolution)

In song-learning birds, vocalizations are species recognition signals and may act as premating reproductive barriers; for allopatric taxa, testing how the signals can influence the speciation processes is quite challenging. This study aims to understand genomic divergence and species recognition via songs in 2 allopatric taxa, eastern and western Nashville warblers (Leiothlypis ruficapilla ruficapilla vs. Leiothlypis ruficapilla ridgwayi). We performed playback experiments to assess their reciprocal behavioral responses, which suggest an asymmetric barrier: the eastern L. r. ruficapilla discriminates between the 2 songs, but the western L. r. ridgwayi does not. Using whole-genome sequencing, we also examined the extent of the taxa’s genomic divergence and estimated their demographic history. We identified dozens of highly differentiated genomic regions, as well as fluctuations in historical effective population sizes that indicate independent demographic trajectories during the Pleistocene. To contextualize the magnitude of divergence between L. ruficapilla subspecies, we applied the same genomic analyses to 2 additional eastern-western pairs of parulid warblers, Setophaga virens vs. Setophaga townsendi and Setophaga coronata coronata vs. Setophaga coronata auduboni, which have existing behavioral studies but are not in strict allopatry. Our findings provide insights into the role of vocalizations in defining within-pair relationships and the important legacy of isolation during the Pleistocene.